2D Plots
Simple Plots
You can plot just about any vector data in R by simply passing the data as parameters to the "plot()" function. Try some of the following and then your own plots.
x=1:20 # create a simple sequence
plot(x) # plot it
You can also create a scatter plot of two vectors. You only need to make sure the vectors have exactly the same number of entries.
x=1:20
y=x*x # creates a vector of x^2 (squared) values
plot(x,y) # plots the x against the y values
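If the two vectors differ in length, plot() stops with an error rather than drawing anything. The sketch below shows this and a simple guard you could place before plotting; the vector name y_short is hypothetical and used only for illustration.

```r
# Hypothetical example vectors; the names are illustrative only
x <- 1:20
y_short <- 1:19   # one entry too few

# plot() raises an error when the lengths differ, so catch it here
result <- tryCatch(
  {
    plot(x, y_short)
    "plotted"
  },
  error = function(e) "length mismatch"
)
print(result)

# A simple guard before plotting vectors of matching length
y <- x * x
stopifnot(length(x) == length(y))
plot(x, y)
```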
If you pass a function followed by start and end values, plot() will evaluate that function over the range of values and draw the result.
plot(qnorm) # quantiles of the normal distribution
plot(sin, -pi, 2*pi) # see ?plot.function
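The same approach should work for functions you write yourself. The sketch below uses a hypothetical function, damped_wave, that is not part of R or of this page; curve() is shown as an alternative that takes an expression in x.

```r
# Hypothetical function used only for illustration
damped_wave <- function(t) exp(-t / 4) * sin(2 * t)

# plot() evaluates the function between the start and end values
plot(damped_wave, 0, 4 * pi)

# curve() draws the same kind of plot from an expression written in terms of x
curve(exp(-x / 4) * sin(2 * x), from = 0, to = 4 * pi)
```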
Adding Labels
Graphs really need to have at least a title and labels on the axes. You can add the parameters below to the plot() function to change the default labels.
main="Main Title"
xlab="X Axis Label"
ylab="Y Axis Label"
You can change the type of graph with the "type" parameter:
Type | Description |
p | Points |
l (an "el") | Lines |
b | Points and lines |
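As a brief illustration, the label parameters and the "type" parameter can be combined in one call. The data and label text below are made up for the example.

```r
# Example data: the squares of 1 to 20
x <- 1:20
y <- x * x

# Draw points joined by lines, with a title and axis labels
plot(x, y,
     type = "b",
     main = "Squares of 1 to 20",
     xlab = "x",
     ylab = "x squared")
```

Changing type to "p" or "l" draws only the points or only the lines.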
Stylizing the Data
You can also specify the color of the data with "col". Examples include:
col="red"
col="blue"
You can use the same hexadecimal format as used with HTML. The format is "#RRGGBB" where RR, GG, and BB are hexadecimal values between 00 and FF.
plot(sin, 0, 2*pi,col="red",xlab="Independent Variable",ylab="Sine Values",main="Sine Function")
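If you prefer to work with red, green, and blue components, the rgb() function builds the "#RRGGBB" string for you. This is a minimal sketch; the variable name orange is just an illustrative choice.

```r
# rgb() converts red, green, and blue components (0 to 255 here)
# into the "#RRGGBB" hexadecimal string
orange <- rgb(255, 165, 0, maxColorValue = 255)
print(orange)

# Use the generated string, or type a hexadecimal color directly
plot(sin, 0, 2 * pi, col = orange, main = "Sine and Cosine")
curve(cos, 0, 2 * pi, col = "#0000FF", add = TRUE)
```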
You can change the shapes that are used to plot data with the "pch" parameter:
- pch = 19: solid circle,
- pch = 20: bullet (smaller solid circle, 2/3 the size of 19),
- pch = 21: filled circle,
- pch = 22: filled square,
- pch = 23: filled diamond,
- pch = 24: filled triangle point-up,
- pch = 25: filled triangle point down.
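To compare these symbols side by side, you can plot one point per pch value. In the sketch below, "col" sets the outline color and "bg" sets the fill color; "bg" only affects symbols 21 through 25. The colors and sizes are arbitrary choices.

```r
# The symbol codes listed above
shapes <- 19:25

# One point per symbol, drawn large enough to see the fill
plot(seq_along(shapes), rep(1, length(shapes)),
     pch = shapes, col = "black", bg = "orange", cex = 2,
     xaxt = "n", yaxt = "n", xlab = "pch value", ylab = "",
     main = "Plotting Symbols 19 to 25")

# Label the x axis with the pch values instead of 1 to 7
axis(1, at = seq_along(shapes), labels = shapes)
```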
Contributed by: Danielle Jones
Box Plots
Box plots show information about the distribution of values between categories of data. The code below produced the box plot just below it. R will automatically find categories within the predictor variable. The box plot will then show the:
- Min: the minimum value of all the values for that category (excluding outliers)
- Max: the maximum value of all the values for that category (excluding outliers)
- Median: the middle value (i.e. half the values will be above and half below)
- Upper quartile: The part of the box above the median contains the 1/4 of the data that is just above the median
- Lower quartile: The part of the box below the median contains the 1/4 of the data that is just below the median
- Outliers: Values that are statistically outside the dominant distribution of the data. By default, boxplot() draws values that lie more than 1.5 times the height of the box beyond the box as outliers.
boxplot(TheData$AnnualPrecip~TheData$Present, main="Annual Precipitation vs. Presence", xlab="Presence", ylab="Annual Precipitation")
Below is a box plot from the "boxplot()" function in R with annotations for the plot elements.
The figure below shows how quartiles are related to standard deviation in a normal curve.
Wikipedia, 2014
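To experiment with boxplot() without the original data set, you can generate a stand-in. The sketch below assumes a made-up data frame with the same column names (Present and AnnualPrecip); the values are random and not from any real data. boxplot() also returns the numbers it draws, which lets you check the elements described above.

```r
# Made-up data: 50 absence (0) and 50 presence (1) records
set.seed(42)
TheData <- data.frame(
  Present = rep(c(0, 1), each = 50),
  AnnualPrecip = c(rnorm(50, mean = 600, sd = 80),
                   rnorm(50, mean = 900, sd = 120))
)

# Draw the box plot and keep the statistics it returns
TheBoxes <- boxplot(AnnualPrecip ~ Present, data = TheData,
                    main = "Annual Precipitation vs. Presence",
                    xlab = "Presence", ylab = "Annual Precipitation")

# One column per category; rows are lower whisker, lower quartile,
# median, upper quartile, upper whisker
print(TheBoxes$stats)

# Values drawn as outliers (may be empty)
print(TheBoxes$out)
```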
Plotting Sorted Data to Check the Overall Distribution
You can sort a vector and then plot its values. Then, you can overlay a straight line to see how much the data deviates from a straight line.
plot(sort(elev)) # to see elevation distribution
lines(c(1,160),range(elev),col=2) # to overlay a straight line of perfect
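The code above assumes a vector named elev with 160 entries. A self-contained sketch with made-up elevations is shown below; it uses length(elev) for the right end of the line so that it works for any number of values.

```r
# Made-up elevation values, uniformly spread between 200 and 1800
set.seed(1)
elev <- runif(160, min = 200, max = 1800)

# Plot the sorted values against their rank
plot(sort(elev))

# Overlay a straight line from the smallest to the largest value
lines(c(1, length(elev)), range(elev), col = 2)
```

Because these values are uniformly distributed, the sorted points should fall close to the line; skewed data would bend away from it.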
Plotting Models and Data
The function below will plot a set of data and a model with confidence intervals for many of the modeling approaches described on this web site.
#########################################################################################
# Creates plots of original data, modeled data, and confidence intervals for one independent
# variable plotted against the response variable for an existing model.
#
# Parameters
# - TheModel - lm, gam, glm, and potentially other models
# - ResponseName - a string containing the name of the response variable used to create the model
# - IndependentName - a string containing the name of the independent variable used in the creation of the model
# - TheData - Original data used to create the model
# - xlab: Optional parameter to replace the IndependentName as the x-axis label
# - ylab: Optional parameter to replace the ResponseName as the y-axis label
#########################################################################################
CombinedModelPlot=function(TheModel,ResponseName,IndependentName,TheData,xlab="",ylab="")
{
  if (xlab=="") xlab=IndependentName
  if (ylab=="") ylab=ResponseName

  Response=TheData[[ResponseName]]
  Independent=TheData[[IndependentName]]

  # Create a sequence that goes over the entire predictor variable range
  UniformPrecip = seq(min(Independent), max(Independent), length.out=100)

  NewData=data.frame(UniformPrecip)
  colnames(NewData) <- c(IndependentName) # set the name of the column to match the name in TheModel

  # Predict over the sequence, with standard errors on the same (response) scale
  ThePredictions = predict(TheModel, newdata = NewData, type="response")
  TheStdErr = predict(TheModel, newdata = NewData, type="response", se.fit=TRUE)

  # Setup the chart area (jjg - use min/max?)
  plot(Independent,Response,xlab=xlab,ylab=ylab) # Plot the original data

  # compute the upper and lower confidence intervals
  UpperCI <- ThePredictions + (2 * TheStdErr$se.fit)
  LowerCI <- ThePredictions - (2 * TheStdErr$se.fit)

  # Plot the polygon by going left to right along the top of the polygon
  # and then right to left along the bottom
  polygon(c(UniformPrecip, rev(UniformPrecip)), c(UpperCI, rev(LowerCI)), col = 'grey80', border = NA)

  lines(UniformPrecip, ThePredictions, col = 'black') # plot the response

  # plot the upper and lower confidence intervals
  lines(UniformPrecip, UpperCI, lty = 'dashed', col = 'black')
  lines(UniformPrecip, LowerCI, lty = 'dashed', col = 'black')

  # Original points
  points(Independent,Response) # Plot the original data
}
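The core of this approach can be tried on its own with a simple linear model. The sketch below is a stand-alone illustration, not the function above: the data are randomly generated, and the column names Precip and Height are assumptions made for the example. It predicts over an even sequence of predictor values, builds a band of plus or minus two standard errors, and draws the band, the fitted line, and the original points.

```r
# Made-up data: a response that rises with precipitation, plus noise
set.seed(7)
Precip <- runif(80, min = 200, max = 1200)
Height <- 5 + 0.02 * Precip + rnorm(80, sd = 3)
TheData <- data.frame(Precip, Height)

# Fit a linear model
TheModel <- lm(Height ~ Precip, data = TheData)

# Predict over an even sequence covering the predictor range
NewData <- data.frame(Precip = seq(min(Precip), max(Precip), length.out = 100))
Pred <- predict(TheModel, newdata = NewData, se.fit = TRUE)

# Band of plus or minus two standard errors around the fit
Upper <- Pred$fit + 2 * Pred$se.fit
Lower <- Pred$fit - 2 * Pred$se.fit

# Original data, shaded band, fitted line, then the points again on top
plot(TheData$Precip, TheData$Height, xlab = "Precip", ylab = "Height")
polygon(c(NewData$Precip, rev(NewData$Precip)), c(Upper, rev(Lower)),
        col = "grey80", border = NA)
lines(NewData$Precip, Pred$fit, col = "black")
points(TheData$Precip, TheData$Height)
```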
Other Resources
Simple Plot from College of the Redwoods